Multinomial Logit Contextual Bandits: Provable Optimality and Practicality

Abstract

We consider a sequential assortment selection problem where the user choice is given by a multinomial logit (MNL) choice model whose parameters are unknown. In each period, the learning agent observes d-dimensional contextual information about the user and the N available items, offers an assortment of size K to the user, and observes bandit feedback of the item chosen from the assortment. We propose upper confidence bound based algorithms for this MNL contextual bandit. The first algorithm is a simple and practical method that achieves Õ(d√T) regret over T rounds. Next, we propose a second algorithm which achieves Õ(√(dT)) regret. This matches the lower bound for the MNL bandit problem, up to logarithmic terms, and improves on the best-known result by a √d factor. To establish this sharper regret bound, we present a non-asymptotic confidence bound for the maximum likelihood estimator of the MNL model that may be of independent interest as its own theoretical contribution. We then revisit the simpler, significantly more practical, first algorithm and show that a simple variant of the algorithm achieves the optimal regret for a broad class of important applications.
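
As an illustration of the choice model the abstract refers to, the following is a minimal sketch, assuming the common MNL formulation in which each offered item has a linear utility (feature vector times parameter vector) and the user may also pick an outside "no choice" option with utility 0. The function names, the example features, and the value of `theta` are hypothetical and are not taken from the paper; the sketch shows only the choice probabilities and the simulated bandit feedback, not the paper's algorithms.

```python
import math
import random


def mnl_choice_probs(features, theta):
    """Return (item_probs, no_choice_prob) for one offered assortment.

    features: list of K feature vectors, each a list of d floats.
    theta: list of d floats (hypothetical parameter vector).
    Assumption: an outside "no choice" option with utility 0 is always present.
    """
    utilities = [sum(x * w for x, w in zip(xs, theta)) for xs in features]
    # Shift by the largest utility (the outside option counts as 0) so that
    # exp() never overflows; the shift cancels in the ratio.
    shift = max(0.0, max(utilities))
    weights = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    total = outside + sum(weights)
    return [w / total for w in weights], outside / total


def sample_choice(features, theta, rng):
    """Simulate bandit feedback: index of the chosen item, or None for no choice."""
    probs, _ = mnl_choice_probs(features, theta)
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return None  # the remaining probability mass belongs to the outside option


# Hypothetical assortment of K = 3 items with d = 2 features each.
assortment = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
theta = [0.2, -0.1]
probs, p_none = mnl_choice_probs(assortment, theta)
choice = sample_choice(assortment, theta, random.Random(0))
```

In a bandit setting the learner would not know `theta`; it would see only the sampled `choice` in each round and would have to estimate the parameters from that feedback.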

Similar resources

The Generalized Multinomial Logit Model

The so-called “mixed” or “heterogeneous” multinomial logit (MIXL) model has become popular in a number of fields, especially Marketing, Health Economics and Industrial Organization. In most applications of the model, the vector of consumer utility weights on product attributes is assumed to have a multivariate normal (MVN) distribution in the population. Thus, some consumers care more about som...

Variational Multinomial Logit Gaussian Process

Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model ha...

Multinomial logit random effects models

This article presents a general approach for logit random effects modelling of clustered ordinal and nominal responses. We review multinomial logit random effects models in a unified form as multivariate generalized linear mixed models. Maximum likelihood estimation utilizes adaptive Gauss–Hermite quadrature within a quasi-Newton maximization algorithm. For cases in which this is computationall...

Semantic Scene Segmentation using Random Multinomial Logit

We introduce Random Multinomial Logit (RML), a general multi-class classifier based on an ensemble of multinomial logistic regression models, and apply it to the task of semantic image segmentation. The algorithm is simple, can be trained efficiently, and has near realtime runtime performance. RML combines the desirable properties of multinomial logistic regression, being stable and theoretical...

Kernalized Collaborative Contextual Bandits

We tackle the problem of recommending products in the online recommendation scenario, which occurs many times in real applications. The most famous and explored instances are news recommendations and advertisements. In this work we propose an extension to the state of the art Bandit models to not only take care of different users’ interactions, but also to go beyond the linearity assumption of ...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i10.17111